Our team is looking at various trends when it comes to housing data, we each did a different part using this housing data in different ways. My part of the project is about affordability of housing in different areas. I chose to use 3 main different places, with a bonus 4th place that I wanted to throw in at the end. The big 3 places are Los Angeles, a mega city, Nashville a large city and Oxford a medium to small city. First I will visualize my Zillow scraped data showing all the different places I scraped. Then I will examine the relationship of how cost effective it is to rent or buy in each of these cities with also examining the difference in price of each area being compared square footage amount of bedrooms and bathrooms to get a general idea on the price of the different areas and then I want to look into a small town around Nashville to see the difference in price being right outside a large city.
Note: The Zillow data is not all of the listings within each area. When scraping, Zillow blocked my attempts to grab all of the listings. Zillow after many attempts would only allow me to grab the first 9 listings per page because Zillow loads new listings once we scroll so when I am pulling from the HTML it cannot see those new listings.
Question 1: What is the average price of homes in each area? (Los Angeles, Nashville, and Oxford)
Before we can begin analyzing affordability across these three very different markets, we need to understand the basic price landscape. Los Angeles represents one of the most expensive housing markets in the entire country - a mega city with millions of residents competing for limited housing stock. Nashville represents a large, growing city that has seen significant population influx in recent years. Oxford, Mississippi is a small college town that offers a stark contrast to the other two markets.
Using the data I scraped from Zillow, here’s a summary of the average prices across all three markets:
Code
# Q1: Average prices by marketmain_markets <- market_summary %>%filter(market %in%c("Los Angeles, CA", "Nashville, TN", "Oxford, MS")) %>%arrange(desc(avg_price))main_markets %>%mutate(avg_price =dollar(avg_price),median_price =dollar(median_price),min_price =dollar(min_price),max_price =dollar(max_price),avg_sqft =comma(round(avg_sqft, 0)),avg_price_per_sqft =dollar(round(avg_price_per_sqft, 0)) ) %>%select(market, city_size, total_listings, avg_price, median_price, avg_sqft, avg_price_per_sqft) %>%kbl(caption ="Average Home Prices by Market - Los Angeles, Nashville, and Oxford",col.names =c("Market", "City Size", "Listings", "Avg Price", "Median Price", "Avg Sq Ft", "Price/Sq Ft"),align =c("l", "l", "c", "r", "r", "r", "r") ) %>%kable_styling(bootstrap_options =c("striped", "hover", "condensed"), full_width =FALSE, position ="center") %>%row_spec(0, bold =TRUE, color ="white", background ="#2C3E50") %>%column_spec(4, bold =TRUE, color ="#E74C3C") %>%footnote(general ="Data scraped from Zillow, November 2025", general_title ="Source: ")
Average Home Prices by Market - Los Angeles, Nashville, and Oxford
Market
City Size
Listings
Avg Price
Median Price
Avg Sq Ft
Price/Sq Ft
Los Angeles, CA
Mega City
204
$5,951,477
$1,324,000
3,540
$1,014
Nashville, TN
Large City
205
$1,046,017
$530,000
2,373
$377
Oxford, MS
Small City
183
$578,600
$488,500
2,337
$246
Source:
Data scraped from Zillow, November 2025
The results immediately confirm what most people would expect: Los Angeles has by far the highest average home prices, followed by Nashville, with Oxford being the most affordable. What’s particularly striking is the price per square foot - in Los Angeles you’re paying significantly more for the same amount of space compared to the other markets.
The difference in average price between Los Angeles and Oxford is substantial. This massive gap illustrates why many families are choosing to relocate from expensive coastal cities to more affordable areas in the South and Midwest.
Question 2: Where are these properties located geographically?
Now that we understand the price differences, I want to visualize where exactly these properties are located across the United States. This will help us see the geographic spread of our data and understand the regional context of each market.
Each property is plotted on the map with colors representing the different markets. You can click on any marker to see details about that property.
Looking at this map, we can see the geographic spread of our three markets. Los Angeles sits on the West Coast, Nashville is in the heart of Tennessee, and Oxford is nestled in northern Mississippi. The navigation buttons allow you to jump directly to each market to explore individual properties.
Question 3: How do home prices compare when we look at the distribution across city sizes?
The average prices tell part of the story, but they can be misleading if there are outliers pulling the numbers up or down. To get a better understanding of the price landscape, I want to look at how prices are distributed in each market. A histogram will show us not just the average, but where most homes actually fall in terms of price.
Code
# Q3: Price distribution histogrammain_market_data <- zillow_filtered %>%filter(market %in%c("Los Angeles, CA", "Nashville, TN", "Oxford, MS")) %>%filter(price <2000000)price_hist <-plot_ly(alpha =0.6) %>%add_histogram(data = main_market_data %>%filter(market =="Los Angeles, CA"),x =~price, name ="Los Angeles (Mega City)", marker =list(color ="#F39C12")) %>%add_histogram(data = main_market_data %>%filter(market =="Nashville, TN"),x =~price, name ="Nashville (Large City)", marker =list(color ="#3498DB")) %>%add_histogram(data = main_market_data %>%filter(market =="Oxford, MS"),x =~price, name ="Oxford (Small City)", marker =list(color ="#9B59B6")) %>%layout(title =list(text ="Home Price Distribution by City Size"),xaxis =list(title ="Price", tickformat ="$,.0f", rangeslider =list(visible =TRUE)),yaxis =list(title ="Number of Properties"),barmode ="overlay",legend =list(x =0.65, y =0.95) )price_hist
The histogram reveals some fascinating patterns. Los Angeles properties are spread across a very wide price range, with many homes priced well over $1 million. This confirms that LA is a premium market where even modest homes command extremely high prices.
Nashville shows a more moderate distribution, with most homes clustering in the $400,000 to $700,000 range. This is still expensive by national standards, but much more accessible than Los Angeles.
Oxford, Mississippi shows a tight cluster of much lower prices, with most homes falling between $200,000 and $400,000. This is the most affordable of our three markets.
Question 4: Is renting or buying more cost-effective in each market?
Understanding home prices is important, but to truly assess affordability, we need to compare the cost of buying versus renting. I obtained Fair Market Rent (FMR) data from the U.S. Department of Housing and Urban Development (HUD) for each of these markets. The FMR represents what HUD considers a reasonable rent for a given area.
By comparing the median home price to the annual rent, we can calculate a “price-to-rent ratio.” This ratio tells us how many years of rent it would take to equal the purchase price of a home. Generally: - A ratio under 15 means buying is favorable - A ratio of 15-20 means it’s roughly equal - A ratio over 20 means renting might be more economically favorable
Code
# Q4: Price-to-rent ratio analysisfmr_lookup <-data.frame(market =c("Los Angeles, CA", "Nashville, TN", "Oxford, MS", "Gallatin, TN"),county =c("Los Angeles County", "Davidson County", "Lafayette County", "Sumner County"),fmr_3br =c(3439, 2143, 1372, 1953))market_with_fmr <- market_summary %>%left_join(fmr_lookup, by ="market") %>%mutate(annual_rent = fmr_3br *12, price_to_rent_ratio =round(median_price / annual_rent, 1))main_with_fmr <- market_with_fmr %>%filter(market %in%c("Los Angeles, CA", "Nashville, TN", "Oxford, MS"))ggplot(main_with_fmr, aes(x =reorder(market, price_to_rent_ratio), y = price_to_rent_ratio, fill = city_size)) +geom_col(width =0.6) +geom_hline(yintercept =15, linetype ="dashed", color ="#27AE60", size =1) +geom_hline(yintercept =20, linetype ="dashed", color ="#F39C12", size =1) +geom_text(aes(label =round(price_to_rent_ratio, 1)), vjust =-0.5, size =4, fontface ="bold") +scale_fill_manual(values =c("#9B59B6", "#3498DB", "#F39C12")) +scale_y_continuous(limits =c(0, 35), breaks =seq(0, 35, 5)) +annotate("text", x =0.6, y =15.5, label ="Favor Buying", color ="#27AE60", size =3, hjust =0) +annotate("text", x =0.6, y =20.5, label ="Consider Renting", color ="#F39C12", size =3, hjust =0) +labs(title ="Price-to-Rent Ratio by Market", subtitle ="Lower ratio = buying is more favorable relative to renting",x ="", y ="Price-to-Rent Ratio (Years)", fill ="City Size") +theme_minimal() +theme(axis.text.x =element_text(size =10), legend.position ="top", plot.title =element_text(face ="bold", size =14)) +coord_flip()
This chart shows the price-to-rent ratio for each market.These thresholds are standard industry guidelines used in rent vs. buy analysis.
Price-to-Rent Analysis: Is Buying or Renting More Cost-Effective?
Market
City Size
Median Price
Monthly Rent (3BR)
Annual Rent
Price/Rent Ratio
Recommendation
Los Angeles, CA
Mega City
$1,324,000
$3,439
$41,268
32.1
Consider Renting
Nashville, TN
Large City
$530,000
$2,143
$25,716
20.6
Consider Renting
Oxford, MS
Small City
$488,500
$1,372
$16,464
29.7
Consider Renting
Oxford, MS has the lowest ratio, making it the most favorable market for buying. With a ratio around 15, the cost of buying is competitive with renting over the long term.
Nashville, TN falls in the middle with a ratio around 20-22. This is borderline - neither option has a clear advantage.
Los Angeles, CA has the highest ratio at around 25+, suggesting that renting might actually be more economical than buying in this expensive market.
Surprising our current home buying market is not in a great state to actually buy a home.
Bonus: What about a small town near a large city?
Up until now, we’ve compared three very different types of cities. But what if you dont want to be paying large city prices? This is Gallatin, Tennessee a surrounding small town ear Nashville TN. It offers access to Nashville’s job market while potentially providing more affordable housing.
Let’s compare Gallatin directly to Nashville. To see if the commute is worth it:
The scatter plot shows how price varies with bedroom count in both markets. The colors represent price categories while the shapes circle(Gallatin) and triangles(Nashville) show the area. This show how affordable each bed room type between these 2 cities.
Nashville vs Gallatin: Does Living in a Small Town Save Money?
Market
City Type
Listings
Median Price
Min Price
Max Price
Avg Sq Ft
Avg Beds
Avg Baths
Price/Sq Ft
Gallatin, TN
Small Town
101
$599,900
$179,900
$3,699,000
2,667
3.5
3.1
$281
Nashville, TN
Large City
205
$530,000
$195,000
$14,900,000
2,373
3.4
3.0
$377
The comparison reveals that Gallatin offers noticeable savings compared to Nashville. While the difference might not be as dramatic but it still allows people to have access to larger cities job market and activities without living in the night life of the city.
Conclusion
After analyzing housing data across Los Angeles (mega city), Nashville (large city), Oxford (small city), and Gallatin (small town), several clear conclusions emerge about housing affordability:
Key Findings:
City size directly correlates with price. Los Angeles homes average significantly more than Nashville, which in turn is more expensive than Oxford. This confirms the intuitive relationship between population density and housing costs.
Price-to-rent ratios vary dramatically. In expensive markets like Los Angeles, renting may actually be more economical than buying when you factor in the extreme home prices. In smaller markets like Oxford, buying becomes much more attractive.
Small towns near large cities offer a middle ground. Gallatin provides access to Nashville’s economy while offering somewhat lower housing costs. This “bedroom community” approach is worth considering for those who don’t need to live in the city center.
Square footage don’t scale linearly with price. In Los Angeles, you pay much more per square foot than in other markets. A modest home in LA might cost more than a mansion in Oxford.
Source Code
---title: "444 Final Project - Housing Data"author: "Brodie Gilbert"date: "11-21-25"format: html: code-fold: true code-tools: true toc: true toc-depth: 3 theme: cosmo embed-resources: true pdf: toc: true toc-depth: 3 geometry: margin=1inexecute: warning: false message: false---# Part 3## IntroductionOur team is looking at various trends when it comes to housing data, we each did a different part using this housing data in different ways. My part of the project is about affordability of housing in different areas. I chose to use 3 main different places, with a bonus 4th place that I wanted to throw in at the end. The big 3 places are Los Angeles, a mega city, Nashville a large city and Oxford a medium to small city. First I will visualize my Zillow scraped data showing all the different places I scraped. Then I will examine the relationship of how cost effective it is to rent or buy in each of these cities with also examining the difference in price of each area being compared square footage amount of bedrooms and bathrooms to get a general idea on the price of the different areas and then I want to look into a small town around Nashville to see the difference in price being right outside a large city.**Note:** The Zillow data is not all of the listings within each area. When scraping, Zillow blocked my attempts to grab all of the listings. Zillow after many attempts would only allow me to grab the first 9 listings per page because Zillow loads new listings once we scroll so when I am pulling from the HTML it cannot see those new listings.```{r setup}#| include: falselibrary(tidyverse)library(readxl)library(leaflet)library(leaflet.extras)library(plotly)library(kableExtra)library(scales)library(tigris)library(sf)options(tigris_use_cache = TRUE)sf_use_s2(FALSE)``````{r load-and-wrangle}#| code-fold: true# Load Zillow data from each marketlos_angeles <- read_csv("data/raw/los_angeles_ca_properties.csv", col_types = cols(.default = "c"))nashville <- read_csv("data/raw/nashville_tn_properties.csv", col_types = cols(.default = "c"))oxford <- read_csv("data/raw/oxford_ms_properties.csv", col_types = cols(.default = "c"))gallatin <- read_csv("data/raw/gallatin_tn_properties.csv", col_types = cols(.default = "c"))# Combine all marketszillow_combined <- bind_rows( los_angeles %>% mutate(market = "Los Angeles, CA", city_size = "Mega City", state = "CA"), nashville %>% mutate(market = "Nashville, TN", city_size = "Large City", state = "TN"), oxford %>% mutate(market = "Oxford, MS", city_size = "Small City", state = "MS"), gallatin %>% mutate(market = "Gallatin, TN", city_size = "Small Town", state = "TN"))# Create market summary statisticsmarket_summary <- zillow_combined %>% mutate( price = as.numeric(gsub("[\\$,]", "", price)), sqft = as.numeric(gsub(",", "", sqft)), bedrooms = as.numeric(bedrooms), bathrooms = as.numeric(bathrooms), year_built = as.numeric(year_built), latitude = as.numeric(latitude), longitude = as.numeric(longitude), price_per_sqft = price / sqft ) %>% filter(!is.na(price), !is.na(sqft), price > 0, sqft > 0) %>% group_by(market, city_size, state) %>% summarize( total_listings = n(), avg_price = mean(price, na.rm = TRUE), median_price = median(price, na.rm = TRUE), min_price = min(price, na.rm = TRUE), max_price = max(price, na.rm = TRUE), avg_sqft = mean(sqft, na.rm = TRUE), avg_bedrooms = mean(bedrooms, na.rm = TRUE), avg_bathrooms = mean(bathrooms, na.rm = TRUE), avg_price_per_sqft = mean(price_per_sqft, na.rm = TRUE), .groups = "drop" )# Clean dataset for property-level analysiszillow_filtered <- zillow_combined %>% mutate( price = as.numeric(gsub("[\\$,]", "", price)), sqft = as.numeric(gsub(",", "", sqft)), bedrooms = as.numeric(bedrooms), bathrooms = as.numeric(bathrooms), year_built = as.numeric(year_built), latitude = as.numeric(latitude), longitude = as.numeric(longitude), price_per_sqft = price / sqft ) %>% filter(!is.na(price), !is.na(sqft), price > 0, sqft > 0)```------------------------------------------------------------------------## Question 1: What is the average price of homes in each area? (Los Angeles, Nashville, and Oxford)Before we can begin analyzing affordability across these three very different markets, we need to understand the basic price landscape. Los Angeles represents one of the most expensive housing markets in the entire country - a mega city with millions of residents competing for limited housing stock. Nashville represents a large, growing city that has seen significant population influx in recent years. Oxford, Mississippi is a small college town that offers a stark contrast to the other two markets.Using the data I scraped from Zillow, here's a summary of the average prices across all three markets:```{r price-summary-table}#| code-fold: true# Q1: Average prices by marketmain_markets <- market_summary %>% filter(market %in% c("Los Angeles, CA", "Nashville, TN", "Oxford, MS")) %>% arrange(desc(avg_price))main_markets %>% mutate( avg_price = dollar(avg_price), median_price = dollar(median_price), min_price = dollar(min_price), max_price = dollar(max_price), avg_sqft = comma(round(avg_sqft, 0)), avg_price_per_sqft = dollar(round(avg_price_per_sqft, 0)) ) %>% select(market, city_size, total_listings, avg_price, median_price, avg_sqft, avg_price_per_sqft) %>% kbl( caption = "Average Home Prices by Market - Los Angeles, Nashville, and Oxford", col.names = c("Market", "City Size", "Listings", "Avg Price", "Median Price", "Avg Sq Ft", "Price/Sq Ft"), align = c("l", "l", "c", "r", "r", "r", "r") ) %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE, position = "center") %>% row_spec(0, bold = TRUE, color = "white", background = "#2C3E50") %>% column_spec(4, bold = TRUE, color = "#E74C3C") %>% footnote(general = "Data scraped from Zillow, November 2025", general_title = "Source: ")```The results immediately confirm what most people would expect: Los Angeles has by far the highest average home prices, followed by Nashville, with Oxford being the most affordable. What's particularly striking is the price per square foot - in Los Angeles you're paying significantly more for the same amount of space compared to the other markets.The difference in average price between Los Angeles and Oxford is substantial. This massive gap illustrates why many families are choosing to relocate from expensive coastal cities to more affordable areas in the South and Midwest.------------------------------------------------------------------------## Question 2: Where are these properties located geographically?Now that we understand the price differences, I want to visualize where exactly these properties are located across the United States. This will help us see the geographic spread of our data and understand the regional context of each market.Each property is plotted on the map with colors representing the different markets. You can click on any marker to see details about that property.```{r us-map}#| code-fold: true# Q2: Interactive map of all propertiesmap_data <- zillow_filtered %>% filter(market %in% c("Los Angeles, CA", "Nashville, TN", "Oxford, MS")) %>% filter(!is.na(latitude), !is.na(longitude))# Custom house icons for each marketmarket_icons <- awesomeIconList( "Los Angeles, CA" = makeAwesomeIcon(icon = "home", markerColor = "orange", library = "fa", iconColor = "white"), "Nashville, TN" = makeAwesomeIcon(icon = "home", markerColor = "blue", library = "fa", iconColor = "white"), "Oxford, MS" = makeAwesomeIcon(icon = "home", markerColor = "purple", library = "fa", iconColor = "white"))oxford_boundary <- places(state = "MS", year = 2022, progress_bar = FALSE) %>% filter(NAME == "Oxford")nashville_boundary <- places(state = "TN", year = 2022, progress_bar = FALSE) %>% filter(NAME == "Nashville-Davidson metropolitan government (balance)")la_boundary <- places(state = "CA", year = 2022, progress_bar = FALSE) %>% filter(NAME == "Los Angeles")us_map <- leaflet(map_data) %>% addProviderTiles(providers$CartoDB.Positron) %>% setView(lng = -89.5192, lat = 34.3665, zoom = 10) %>% addPolygons(data = oxford_boundary, fillColor = "#9B59B6", fillOpacity = 0.15, color = "#9B59B6", weight = 2) %>% addPolygons(data = nashville_boundary, fillColor = "#3498DB", fillOpacity = 0.15, color = "#3498DB", weight = 2) %>% addPolygons(data = la_boundary, fillColor = "#F39C12", fillOpacity = 0.15, color = "#F39C12", weight = 2) %>% addAwesomeMarkers( lng = ~longitude, lat = ~latitude, icon = ~market_icons[market], popup = ~paste0("<b>", market, "</b><br>", address, "<br><b>Price:</b> ", dollar(price), "<br><b>Beds:</b> ", bedrooms, " | <b>Baths:</b> ", bathrooms, "<br><b>Sq Ft:</b> ", comma(sqft)), label = ~paste0(market, ": ", dollar(price)), clusterOptions = markerClusterOptions() ) %>% addLegend(position = "bottomright", colors = c("#F69730", "#38AADD", "#9B59B6"), labels = c("Los Angeles, CA", "Nashville, TN", "Oxford, MS"), title = "Markets", opacity = 0.8) %>% addEasyButton(easyButton(icon = htmltools::HTML("<b>LA</b>"), title = "Los Angeles", onClick = JS("function(btn, map){ map.setView([34.0522, -118.2437], 10); }"))) %>% addEasyButton(easyButton(icon = htmltools::HTML("<b>NSH</b>"), title = "Nashville", onClick = JS("function(btn, map){ map.setView([36.1627, -86.7816], 11); }"))) %>% addEasyButton(easyButton(icon = htmltools::HTML("<b>OXF</b>"), title = "Oxford", onClick = JS("function(btn, map){ map.setView([34.3665, -89.5192], 12); }")))us_map```Looking at this map, we can see the geographic spread of our three markets. Los Angeles sits on the West Coast, Nashville is in the heart of Tennessee, and Oxford is nestled in northern Mississippi. The navigation buttons allow you to jump directly to each market to explore individual properties.------------------------------------------------------------------------## Question 3: How do home prices compare when we look at the distribution across city sizes?The average prices tell part of the story, but they can be misleading if there are outliers pulling the numbers up or down. To get a better understanding of the price landscape, I want to look at how prices are distributed in each market. A histogram will show us not just the average, but where most homes actually fall in terms of price.```{r price-histogram}#| code-fold: true# Q3: Price distribution histogrammain_market_data <- zillow_filtered %>% filter(market %in% c("Los Angeles, CA", "Nashville, TN", "Oxford, MS")) %>% filter(price < 2000000)price_hist <- plot_ly(alpha = 0.6) %>% add_histogram(data = main_market_data %>% filter(market == "Los Angeles, CA"), x = ~price, name = "Los Angeles (Mega City)", marker = list(color = "#F39C12")) %>% add_histogram(data = main_market_data %>% filter(market == "Nashville, TN"), x = ~price, name = "Nashville (Large City)", marker = list(color = "#3498DB")) %>% add_histogram(data = main_market_data %>% filter(market == "Oxford, MS"), x = ~price, name = "Oxford (Small City)", marker = list(color = "#9B59B6")) %>% layout( title = list(text = "Home Price Distribution by City Size"), xaxis = list(title = "Price", tickformat = "$,.0f", rangeslider = list(visible = TRUE)), yaxis = list(title = "Number of Properties"), barmode = "overlay", legend = list(x = 0.65, y = 0.95) )price_hist```The histogram reveals some fascinating patterns. Los Angeles properties are spread across a very wide price range, with many homes priced well over \$1 million. This confirms that LA is a premium market where even modest homes command extremely high prices.Nashville shows a more moderate distribution, with most homes clustering in the \$400,000 to \$700,000 range. This is still expensive by national standards, but much more accessible than Los Angeles.Oxford, Mississippi shows a tight cluster of much lower prices, with most homes falling between \$200,000 and \$400,000. This is the most affordable of our three markets.------------------------------------------------------------------------## Question 4: Is renting or buying more cost-effective in each market?Understanding home prices is important, but to truly assess affordability, we need to compare the cost of buying versus renting. I obtained Fair Market Rent (FMR) data from the U.S. Department of Housing and Urban Development (HUD) for each of these markets. The FMR represents what HUD considers a reasonable rent for a given area.By comparing the median home price to the annual rent, we can calculate a "price-to-rent ratio." This ratio tells us how many years of rent it would take to equal the purchase price of a home. Generally: - A ratio under 15 means buying is favorable - A ratio of 15-20 means it's roughly equal - A ratio over 20 means renting might be more economically favorable```{r rent-comparison}#| code-fold: true# Q4: Price-to-rent ratio analysisfmr_lookup <- data.frame( market = c("Los Angeles, CA", "Nashville, TN", "Oxford, MS", "Gallatin, TN"), county = c("Los Angeles County", "Davidson County", "Lafayette County", "Sumner County"), fmr_3br = c(3439, 2143, 1372, 1953))market_with_fmr <- market_summary %>% left_join(fmr_lookup, by = "market") %>% mutate(annual_rent = fmr_3br * 12, price_to_rent_ratio = round(median_price / annual_rent, 1))main_with_fmr <- market_with_fmr %>% filter(market %in% c("Los Angeles, CA", "Nashville, TN", "Oxford, MS"))ggplot(main_with_fmr, aes(x = reorder(market, price_to_rent_ratio), y = price_to_rent_ratio, fill = city_size)) + geom_col(width = 0.6) + geom_hline(yintercept = 15, linetype = "dashed", color = "#27AE60", size = 1) + geom_hline(yintercept = 20, linetype = "dashed", color = "#F39C12", size = 1) + geom_text(aes(label = round(price_to_rent_ratio, 1)), vjust = -0.5, size = 4, fontface = "bold") + scale_fill_manual(values = c("#9B59B6", "#3498DB", "#F39C12")) + scale_y_continuous(limits = c(0, 35), breaks = seq(0, 35, 5)) + annotate("text", x = 0.6, y = 15.5, label = "Favor Buying", color = "#27AE60", size = 3, hjust = 0) + annotate("text", x = 0.6, y = 20.5, label = "Consider Renting", color = "#F39C12", size = 3, hjust = 0) + labs(title = "Price-to-Rent Ratio by Market", subtitle = "Lower ratio = buying is more favorable relative to renting", x = "", y = "Price-to-Rent Ratio (Years)", fill = "City Size") + theme_minimal() + theme(axis.text.x = element_text(size = 10), legend.position = "top", plot.title = element_text(face = "bold", size = 14)) + coord_flip()```This chart shows the price-to-rent ratio for each market.These thresholds are standard industry guidelines used in rent vs. buy analysis.```{r rent-ratio-table}#| code-fold: true# Q4: Rent vs buy summary tablemain_with_fmr %>% select(market, city_size, median_price, fmr_3br, annual_rent, price_to_rent_ratio) %>% mutate( median_price = dollar(median_price), fmr_3br = dollar(fmr_3br), annual_rent = dollar(annual_rent), recommendation = case_when( price_to_rent_ratio < 15 ~ "Favor Buying", price_to_rent_ratio <= 20 ~ "About Equal", TRUE ~ "Consider Renting" ) ) %>% kbl( caption = "Price-to-Rent Analysis: Is Buying or Renting More Cost-Effective?", col.names = c("Market", "City Size", "Median Price", "Monthly Rent (3BR)", "Annual Rent", "Price/Rent Ratio", "Recommendation"), align = c("l", "l", "r", "r", "r", "c", "c") ) %>% kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE) %>% row_spec(0, bold = TRUE, color = "white", background = "#34495E") %>% column_spec(6, bold = TRUE) %>% column_spec(7, bold = TRUE, color = "white", background = spec_color(c(25, 22, 15), scale_from = c(10, 30), palette = c("#27AE60", "#F39C12", "#E74C3C")))```- **Oxford, MS** has the lowest ratio, making it the most favorable market for buying. With a ratio around 15, the cost of buying is competitive with renting over the long term.- **Nashville, TN** falls in the middle with a ratio around 20-22. This is borderline - neither option has a clear advantage.- **Los Angeles, CA** has the highest ratio at around 25+, suggesting that renting might actually be more economical than buying in this expensive market. Surprising our current home buying market is not in a great state to actually buy a home.------------------------------------------------------------------------## Bonus: What about a small town near a large city?Up until now, we've compared three very different types of cities. But what if you dont want to be paying large city prices? This is Gallatin, Tennessee a surrounding small town ear Nashville TN. It offers access to Nashville's job market while potentially providing more affordable housing.Let's compare Gallatin directly to Nashville. To see if the commute is worth it:```{r nashville-gallatin-scatter}#| code-fold: true# Bonus: Nashville vs Gallatin price comparisontn_scatter_data <- zillow_filtered %>% filter(market %in% c("Nashville, TN", "Gallatin, TN")) %>% filter(!is.na(bedrooms), !is.na(price), bedrooms > 0) %>% mutate( price_category = case_when( price < 300000 ~ "Under $300K", price < 500000 ~ "$300K - $500K", price < 750000 ~ "$500K - $750K", TRUE ~ "$750K+" ), price_category = factor(price_category, levels = c("Under $300K", "$300K - $500K", "$500K - $750K", "$750K+")) )ggplot(tn_scatter_data, aes(x = bedrooms, y = price, color = price_category, shape = market)) + geom_point(alpha = 0.8, size = 4) + scale_y_continuous(labels = dollar_format()) + scale_x_continuous(breaks = 1:6) + scale_color_manual(values = c("#27AE60", "#3498DB", "#F39C12", "#E74C3C")) + scale_shape_manual(values = c(16, 17)) + labs( title = "Price vs Bedrooms: Nashville vs Gallatin", subtitle = "Comparing home prices by bedroom count in both markets", x = "Number of Bedrooms", y = "Price", color = "Price Category", shape = "Market" ) + theme_minimal() + theme(legend.position = "right", plot.title = element_text(face = "bold", size = 14))```The scatter plot shows how price varies with bedroom count in both markets. The colors represent price categories while the shapes circle(Gallatin) and triangles(Nashville) show the area. This show how affordable each bed room type between these 2 cities.```{r gallatin-comparison}#| code-fold: true# Bonus: Nashville vs Gallatin summarytn_detailed <- zillow_filtered %>% filter(market %in% c("Nashville, TN", "Gallatin, TN")) %>% group_by(market) %>% summarize( total_listings = n(), median_price = median(price, na.rm = TRUE), min_price = min(price, na.rm = TRUE), max_price = max(price, na.rm = TRUE), avg_sqft = mean(sqft, na.rm = TRUE), avg_bedrooms = round(mean(bedrooms, na.rm = TRUE), 1), avg_bathrooms = round(mean(bathrooms, na.rm = TRUE), 1), avg_price_per_sqft = mean(price_per_sqft, na.rm = TRUE), .groups = "drop" ) %>% mutate(city_type = ifelse(market == "Nashville, TN", "Large City", "Small Town"))tn_detailed %>% mutate( median_price = dollar(median_price), min_price = dollar(min_price), max_price = dollar(max_price), avg_sqft = comma(round(avg_sqft, 0)), avg_price_per_sqft = dollar(round(avg_price_per_sqft, 0)) ) %>% select(market, city_type, total_listings, median_price, min_price, max_price, avg_sqft, avg_bedrooms, avg_bathrooms, avg_price_per_sqft) %>% kbl( caption = "Nashville vs Gallatin: Does Living in a Small Town Save Money?", col.names = c("Market", "City Type", "Listings", "Median Price", "Min Price", "Max Price", "Avg Sq Ft", "Avg Beds", "Avg Baths", "Price/Sq Ft"), align = c("l", "l", "c", "r", "r", "r", "r", "c", "c", "r") ) %>% kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE) %>% row_spec(0, bold = TRUE, color = "white", background = "#2C3E50") %>% row_spec(2, background = "#D5F5E3")```The comparison reveals that Gallatin offers noticeable savings compared to Nashville. While the difference might not be as dramatic but it still allows people to have access to larger cities job market and activities without living in the night life of the city.------------------------------------------------------------------------## ConclusionAfter analyzing housing data across Los Angeles (mega city), Nashville (large city), Oxford (small city), and Gallatin (small town), several clear conclusions emerge about housing affordability:**Key Findings:**1. **City size directly correlates with price.** Los Angeles homes average significantly more than Nashville, which in turn is more expensive than Oxford. This confirms the intuitive relationship between population density and housing costs.2. **Price-to-rent ratios vary dramatically.** In expensive markets like Los Angeles, renting may actually be more economical than buying when you factor in the extreme home prices. In smaller markets like Oxford, buying becomes much more attractive.3. **Small towns near large cities offer a middle ground.** Gallatin provides access to Nashville's economy while offering somewhat lower housing costs. This "bedroom community" approach is worth considering for those who don't need to live in the city center.4. **Square footage don't scale linearly with price.** In Los Angeles, you pay much more per square foot than in other markets. A modest home in LA might cost more than a mansion in Oxford.